Abstract

The "turbo codes", recently proposed by Berrou et al., are written as a
disordered spin Hamiltonian. It is shown that there is a threshold $\Theta$ such that
for signal to noise ratios $v^2/w^2 > \Theta$ the error probability per bit vanishes in the
thermodynamic limit, i.e. the limit of infinitely long sequences. The value of the
threshold has been computed for two particular turbo codes. It is found that it
depends on the code. These results are compared with numerical simulations.
1 Introduction.

The recent invention of "turbo codes" by Berrou and Glavieux [1] is considered a major
breakthrough in communications. For the first time one can communicate almost error-free
at signal to noise ratios very close to the theoretical bounds of information theory.
Turbo codes are rapidly becoming the new standard for error correcting codes in digital
communications. The invention of turbo codes and of their iterative decoding algorithm was
empirical. There is no theoretical understanding of why they are so successful. The
decoding algorithm is thought to be an approximate algorithm. We think that turbo codes
are interesting, even outside the context of communication theory, because they provide
a non trivial example of a disordered system which can be studied numerically with a fast
algorithm.

In this paper we will study turbo codes and turbo decoding using the modern tools of
statistical mechanics of disordered systems. One of us has already shown in the past [2]
that there is a mathematical equivalence between error correcting codes and theoretical
models of spin glasses. In particular the logarithm of the probability for any given signal,
conditional on the communication channel output, has the form of a spin glass Hamiltonian.
We will construct the Hamiltonian which corresponds to the turbo codes and study
its properties. This will clarify why they are so successful. In particular we will show
that there is a threshold $\Theta$ such that for signal to noise ratios $v^2/w^2 > \Theta$ the average
error probability per bit $P_e$ vanishes in the thermodynamic limit, i.e. the limit of infinitely
long sequences. In $P_e$ the average is taken over a large class of turbo codes (see later) and
over "channel" noise. The rate of these codes is finite. The value of the threshold has
been computed for two particular turbo codes. It was found that it depends on the code.
We also compare these results with numerical simulations.

Our results are typical of the statistical mechanics approach: we study only the average
performance of turbo codes, not the performance of any particular one. Furthermore
there exist "very few" particular codes performing "much worse" than the average.

Let us first briefly recall the connection between error-correcting codes and spin-glass
models. In the mathematical theory of communication both the production of information
and its transmission are considered as probabilistic events. A source is producing
information messages according to a certain probability distribution. Messages of length
$N$ are sequences of symbols or "letters of an alphabet" $a_1, a_2, \dots, a_N$. We will assume
for simplicity a binary alphabet, i.e. $a_i = 0$ or $1$, and that all symbols are equally probable.
Instead of $a_i$ we can equally well use Ising spins

$$\sigma_i = (-1)^{a_i} = \pm 1 \qquad (1.1)$$

The messages are sent through a noisy transmission channel. If a $\sigma = \pm 1$ is sent through
the transmission channel, because of the noise, the output will be a real number
$\sigma^{out}$, in general different from $\sigma$. Again, the statistical properties of the
transmission channel are supposed to be known. Let us call $Q(\sigma^{out}|\sigma)\,d\sigma^{out}$ the
probability for the transmission channel's output to be between $\sigma^{out}$ and
$\sigma^{out} + d\sigma^{out}$, when the input was $\sigma$. For reasons of simplicity, we assume that the
noise is independent for any pair of bits ("memoryless channel"), i.e.

$$Q(\boldsymbol{\sigma}^{out}|\boldsymbol{\sigma}) = \prod_{i} Q(\sigma_i^{out}|\sigma_i) \qquad (1.2)$$

In the case of a memoryless channel and a gaussian noise of variance $w^2$:

$$Q(\sigma^{out}|\sigma) = \frac{1}{\sqrt{2\pi w^2}} \exp\left\{-\frac{(\sigma^{out}-\sigma)^2}{2w^2}\right\} \qquad (1.3)$$

Shannon calculated the channel's capacity $C$, i.e. the maximum information per use of the
channel that can be transmitted. For the gaussian channel

$$C_{gauss} = \frac{1}{2}\log_2\left(1 + \frac{v^2}{w^2}\right) \qquad (1.4)$$

where $v^2$ is the signal power.
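
As a numerical illustration of Eq. (1.4), the short sketch below evaluates the gaussian capacity and the smallest signal to noise ratio at which a code of a given rate can, in principle, be decoded without errors. The function names are illustrative choices made here, not notation from the text.

```python
import math

def gaussian_capacity(snr):
    """Capacity in bits per channel use, Eq. (1.4), with snr = v^2 / w^2."""
    return 0.5 * math.log2(1.0 + snr)

def shannon_snr_threshold(rate):
    """Smallest snr = v^2 / w^2 for which the capacity equals the code rate."""
    return 2.0 ** (2.0 * rate) - 1.0

def to_db(x):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(x)

# Rate 1/2, the rate of the convolutional codes discussed in Sec. 2.
snr_min = shannon_snr_threshold(0.5)
print(snr_min, to_db(snr_min), gaussian_capacity(snr_min))
```

Inverting Eq. (1.4) in this way is the same calculation that is used for the Shannon threshold at rate 1/3 in Sec. 4.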

Under the above assumptions, communication is a statistical inference problem. Given
the transmission channel's output and the statistical properties of the source and of the
channel, one has to infer what message was sent. In order to reduce communication errors,
one may introduce (deterministic) redundancy into the message ("channel encoding") and
use this redundancy to infer the message sent through the channel ("decoding"). The
algorithms which transform the source outputs to redundant messages are called
error-correcting codes. More precisely, instead of sending the $N$ original bits $\sigma_i$, one sends $M$
bits $J_k^{in}$, $k = 1, \dots, M$, $M > N$, constructed in the following way

$$J_k^{in} = C^{(k)}_{i_1 \dots i_{l_k}}\, \sigma_{i_1} \cdots \sigma_{i_{l_k}} \qquad (1.5)$$

where the "connectivity" matrix $C^{(k)}_{i_1 \dots i_{l_k}}$ has elements zero or one. For any $k$, all the
$C^{(k)}_{i_1 \dots i_{l_k}}$ except one are equal to zero, i.e. the $J_k^{in}$ are equal to $\pm 1$. $C^{(k)}_{i_1 \dots i_{l_k}}$ defines the code,
i.e. it tells from which of the $\sigma$'s to construct the $k$th bit of the code.
Codes of this kind are called parity checking codes because $J_k^{in}$ counts the parity of the
minus signs among the $l_k$ $\sigma$'s. The ratio $R = N/M$, which specifies the redundancy of the code,
is called the rate of the code.
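
The encoding rule of Eq. (1.5) can be sketched in a few lines. The list `checks` below plays the role of the connectivity matrix: entry $k$ lists the spins multiplied together to form the $k$th transmitted bit. The particular message and set of checks are hypothetical and chosen only for illustration.

```python
from math import prod

def encode(sigma, checks):
    """Eq. (1.5): each transmitted bit is the product of the spins in one check."""
    return [prod(sigma[i] for i in idx) for idx in checks]

# Hypothetical example: N = 4 source spins, M = 8 parity checks.
sigma = [+1, -1, -1, +1]
checks = [(0,), (1,), (2,), (3,), (0, 1), (1, 2), (2, 3), (0, 1, 2, 3)]

J_in = encode(sigma, checks)
rate = len(sigma) / len(checks)   # R = N / M
print(J_in, rate)
```

Each entry of `J_in` is $+1$ when the corresponding check contains an even number of minus signs and $-1$ otherwise, which is the parity counting described above.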

Knowing the source probability, the noise probability, the code and the channel output,
one has to infer the message that was sent. The quality of inference depends on the choice
of the code.

According to Shannon's famous channel coding theorem, there exist codes which,
in the limit of infinitely long messages, allow error-free communication, provided the rate
of the code $R$ is less than the channel capacity $C$. This theorem says that such "ideal"
codes exist, but does not say how to construct them.

We have shown that there exists a close mathematical relationship between error-correcting
codes and theoretical models of disordered systems. As we previously said, the output of
the channel is a sequence of $M$ real numbers $\mathbf{J}^{out} = \{J_k^{out},\ k = 1, \dots, M\}$, which are
random variables, obeying the probability distribution $Q(J_k^{out}|J_k^{in})$. Once the channel
output $\mathbf{J}^{out}$ is known, it is possible to compute the probability $P(\boldsymbol{\tau}|\mathbf{J}^{out})$ for any
particular sequence $\boldsymbol{\tau} = \{\tau_i,\ i = 1, \dots, N\}$ to be the source output (i.e. the information
message).

More precisely, the equivalence between spin-glass models and error correcting codes is
based on the following property.

The probability $P(\boldsymbol{\tau}|\mathbf{J}^{out})$ for any sequence $\boldsymbol{\tau}$ to be the information message, conditional
on the channel output $\mathbf{J}^{out}$, is given by

$$\ln P(\boldsymbol{\tau}|\mathbf{J}^{out}) = \mathrm{const} + \sum_{k=1}^{M} C^{(k)}_{i_1 \dots i_{l_k}}\, B_k\, \tau_{i_1} \cdots \tau_{i_{l_k}} \equiv -H(\boldsymbol{\tau}) \qquad (1.6)$$

where

$$B_k \equiv B(J_k^{out}) = \frac{1}{2} \ln \frac{Q(J_k^{out}|1)}{Q(J_k^{out}|-1)} \qquad (1.7)$$

We recognize in this expression the Hamiltonian of a p-spin spin glass. The
distribution of the couplings is determined by the probability $Q(J^{out}|J^{in})$.
In the case when $Q(J^{out}|J^{in}) = Q(-J^{out}|-J^{in})$ (the case of a "symmetric channel"),
$B(J^{out}) = -B(-J^{out})$ and one recovers the invariance of the spin-glass Hamiltonian
under gauge transformations.

"Minimum error probability decoding" (or MED), which is widely used in communications,
consists in choosing the most probable sequence $\boldsymbol{\tau}^0$. This is equivalent to finding the
ground state of the above spin-glass Hamiltonian.
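
As a worked instance of Eq. (1.7), assume a gaussian channel with noise variance $w^2$ and inputs $J^{in} = \pm 1$ (the unit signal amplitude is an assumption made here for illustration). The couplings then reduce to the channel output rescaled by the noise variance:

```latex
\begin{align*}
B(J^{out}) &= \frac{1}{2}\ln\frac{Q(J^{out}|+1)}{Q(J^{out}|-1)}
            = \frac{1}{2}\left[-\frac{(J^{out}-1)^2}{2w^2} + \frac{(J^{out}+1)^2}{2w^2}\right] \\
           &= \frac{J^{out}}{w^2} .
\end{align*}
```

Under this assumption $B(J^{out}) = -B(-J^{out})$, as expected for a symmetric channel, and the couplings become large when the noise is small.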

Instead of considering the most probable instance, one may only be interested in the most
probable value $\tau_i^{MAP}$ of the "bit" $\tau_i$ (Maximum A posteriori Probability or MAP decoding),
which can be expressed in terms of the magnetization at temperature $T = 1/\beta$ equal
to one:

$$\tau_i^{MAP} = \mathrm{sign}(m_i)\ ; \qquad m_i = \frac{1}{Z} \sum_{\boldsymbol{\tau}} \tau_i \exp\{-H(\boldsymbol{\tau})\} \qquad (1.8)$$

where $H(\boldsymbol{\tau})$ is defined by Eq. (1.6).

It is remarkable that $\beta = 1$ coincides with the Nishimori temperature in spin glasses.
MAP decoding is an essential ingredient in turbo decoding (see later).
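
For very short messages the magnetizations of Eq. (1.8) can be obtained by direct enumeration, which makes the definition concrete. The sketch below is a brute-force illustration only (its cost grows like $2^N$); the checks, the channel outputs and the noise variance are hypothetical values, and the couplings are taken as $B_k = J_k^{out}/w^2$, which assumes a gaussian channel with inputs $\pm 1$.

```python
import itertools
import math

def energy(tau, checks, B):
    """H(tau) = -sum_k B_k * prod_{i in check k} tau_i, i.e. Eq. (1.6) up to a constant."""
    return -sum(b * math.prod(tau[i] for i in idx) for idx, b in zip(checks, B))

def map_decode(n, checks, B):
    """Return (decoded spins, magnetizations) from Eq. (1.8) by summing over all 2^n states."""
    Z = 0.0
    m = [0.0] * n
    for tau in itertools.product((+1, -1), repeat=n):
        weight = math.exp(-energy(tau, checks, B))
        Z += weight
        for i in range(n):
            m[i] += tau[i] * weight
    m = [x / Z for x in m]
    return [1 if x >= 0 else -1 for x in m], m

# Hypothetical example: the message (+1, -1, +1) is sent through five checks.
checks = [(0,), (1,), (2,), (0, 1), (1, 2)]
J_out = [0.9, 0.2, 1.1, -0.8, -1.3]   # noisy outputs; the second one has the wrong sign
w2 = 0.5                               # assumed noise variance
B = [j / w2 for j in J_out]            # gaussian-channel couplings (assumption)

decoded, m = map_decode(3, checks, B)
print(decoded, m)
```

In this example the second channel output alone would suggest $\tau_2 = +1$, while the two parity checks involving $\tau_2$ pull the magnetization $m_2$ to a negative value, so the redundancy corrects the error.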
When all messages are equally probable and the transmission channel is memoryless and
symmetric, the error probability is the same for all input sequences. It is enough to
compute it in the case where all input bits are equal to one. In this case, the error
probability per bit $P_e$ is

$$P_e = \frac{1}{2}\left(1 - m^{(d)}\right) \equiv \frac{1}{2}\left(1 - \frac{1}{N}\sum_{i=1}^{N} \tau_i^{(d)}\right) \qquad (1.9)$$

and $\tau_i^{(d)}$ is the symbol sequence produced by the decoding procedure. One can derive from
this a very general lower bound for $P_e$, using the analog of the low temperature expansion.
An obvious bound (for zero temperature decoding) is provided by the probability $P_e^{(1)}$ that
only one bit is incorrect, i.e. $\tau_j = -1$ while all other bits are correct, i.e. $\tau_i = 1$ for all
$i \neq j$:

$$P_e \geq P_e^{(1)} = \text{Probability of } \Big\{ \sum_{k \in \Omega(j)} B_k < 0 \Big\} \qquad (1.10)$$

where $\Omega(j)$ denotes the set of the couplings in which $\tau_j$ appears.
A necessary condition for transmitting without errors is that $\sum_{k \in \Omega(j)} B_k > 0$ with
probability one. This is only possible if every spin appears in an infinite number of terms in
the Hamiltonian. Let $l_k$ be the number of spins coupled through the coupling $B_k$.
The total number of spins being $N$, a spin appears on the average in

$$\frac{1}{N}\sum_{k=1}^{M} l_k = \frac{1}{R}\,\frac{1}{M}\sum_{k=1}^{M} l_k = \frac{l}{R} \qquad (1.11)$$

terms, where $l$ is the average of $l_k$ (the number of spins coupled together) and $R$ is the
rate of the code.
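
A small worked instance of Eq. (1.11), using hypothetical numbers chosen only for illustration: take a rate $R = 1/2$ code in which half of the $M = 2N$ couplings involve $l_k = 3$ spins and the other half involve $l_k = 2$ spins.

```latex
\begin{align*}
l &= \frac{1}{M}\sum_{k=1}^{M} l_k = \frac{3+2}{2} = \frac{5}{2}, \\
\frac{1}{N}\sum_{k=1}^{M} l_k &= \frac{l}{R} = \frac{5/2}{1/2} = 5 .
\end{align*}
```

Each spin then appears in 5 terms on average. This number stays finite as $N$ grows, so by the argument above such a code cannot reach zero error probability.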

So a necessary condition for a finite rate code to achieve zero error probability is that the
average number of spins coupled together diverges in the thermodynamic limit ($N \to \infty$).
This condition is realised in Derrida's random energy model, which has been shown to
be an ideal code [2] (in that case $R = 0$).

We will show in the following that this is also true for the case of recursive turbo codes,
while it is not true for non recursive turbo codes.



2 Convolutional codes. 

Convolutional codes are the building blocks of turbo codes. In this section we shall 
describe both non recursive and recursive convolutional codes and the corresponding spin 
models. The information message i.e. the source output (before encoding) will be denoted 
by: 

$$\boldsymbol{\tau} \equiv (\tau_1, \dots, \tau_N) \qquad (2.1)$$

It is convenient to think of the source producing a symbol per unit time, i.e. in $\tau_i$, $i$
denotes the time. For simplicity we consider a code of rate $R = 1/2$. The encoded
message has the form:

$$\mathbf{J} \equiv (J_1^{(1)}, \dots, J_N^{(1)};\ J_1^{(2)}, \dots, J_N^{(2)}) \qquad (2.2)$$

Any hardware implementation of a convolutional encoder contains a sequence of $r$ memory
registers. We shall call $r$ the range of the code.

Let us denote by $\Sigma_1(t), \dots, \Sigma_r(t)$ the content of the memory registers at time $t$. At each
time step the content of each memory register is shifted to the right:

$$\Sigma_{j+1}(t+1) = \Sigma_j(t) \quad \text{for } j = 1, \dots, r-1 \qquad (2.3)$$
Moreover for convenience of notation we define

$$\Sigma_0(t) \equiv \Sigma_1(t+1) \qquad (2.4)$$

and the following sequence of bits which we shall call the register sequence:

$$\boldsymbol{\sigma} \equiv (\sigma_1, \dots, \sigma_N), \qquad \sigma_i \equiv \Sigma_0(i) \qquad (2.5)$$

The sequence of the $\sigma$'s is a function of the source sequence (a function which may
depend on the code):

$$\boldsymbol{\tau} \to \boldsymbol{\sigma}(\boldsymbol{\tau}) \equiv (\sigma_1(\boldsymbol{\tau}), \dots, \sigma_N(\boldsymbol{\tau})) \qquad (2.6)$$

When not ambiguous we shall omit in the following the functional dependence of $\boldsymbol{\sigma}$ upon
$\boldsymbol{\tau}$.

For non recursive convolutional codes this application is extremely simple:

$$\sigma_i(\boldsymbol{\tau}) = \Sigma_0(i) \equiv \tau_i \qquad (2.7)$$

The encoded message $\mathbf{J}$ is easily defined in terms of the content of the register sequence:

$$J_i^{(n)} \equiv \prod_{j=0}^{r} \left(\Sigma_j(i)\right)^{\kappa(j;n)} = \prod_{j=0}^{r} \left(\sigma_{i-j}\right)^{\kappa(j;n)}, \qquad i = 1, \dots, N\ ;\ n = 1, 2\ ;\ \kappa(j;n) \in \{0,1\} \qquad (2.8)$$

We shall assume hereafter that $\kappa(0;1) = \kappa(0;2) = 1$.

To avoid redundancy we choose $r$ such that either $\kappa(r;1)$ or $\kappa(r;2)$ is different from 0.
To make Eq. (2.8) meaningful for $i = 1, \dots, r$ we define $\sigma_j = +1$ for $j \leq 0$. Notice however
that the exact definition of $J_1^{(n)}, \dots, J_r^{(n)}$ is irrelevant in the thermodynamic limit.
The numbers $\kappa(j;n)$ define the code. Several conventions are used to give them in a
compact form. A simple and useful one is the following. To each of the two sets of
numbers $\kappa(j;1)$ and $\kappa(j;2)$ is associated a polynomial on $Z_2$

$$g_n(x) = \sum_{j=0}^{r} \kappa(j;n)\, x^j . \qquad (2.9)$$

The $g_n$ are called generating polynomials. In the same way we can associate a polynomial
to the register sequence ($G(x) = \sum_{j=1}^{N} \hat{\sigma}_j x^j$ ; $\sigma_j \equiv (-1)^{\hat{\sigma}_j}$), to the source
message ($H(x) = \sum_{j=1}^{N} \hat{\tau}_j x^j$ ; $\tau_j \equiv (-1)^{\hat{\tau}_j}$) and to each part of the encoded message
($\mathcal{G}^{(n)}(x) = \sum_{j=1}^{N} \hat{J}_j^{(n)} x^j$ ; $J_j^{(n)} \equiv (-1)^{\hat{J}_j^{(n)}}$). With these definitions it is evident that the
correspondence (2.7) between the source and the register sequences for a non recursive
convolutional code implies:

$$G(x) = H(x) \qquad (2.10)$$

and the encoding rule (2.8) becomes

$$\mathcal{G}^{(n)}(x) = g_n(x)\, G(x) = g_n(x)\, H(x) \qquad (2.11)$$
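A few examples are the following:

(a). The simplest non trivial convolutional code has range 1:

$$J_i^{(1)} = \sigma_i \sigma_{i-1} \quad \Rightarrow \quad g_1(x) = 1 + x \qquad (2.12)$$
$$J_i^{(2)} = \sigma_i \quad \Rightarrow \quad g_2(x) = 1 \qquad (2.13)$$

(b). A simple code with range $r = 2$ whose behavior will be examined in what follows:

$$J_i^{(1)} = \sigma_i \sigma_{i-1} \sigma_{i-2} \quad \Rightarrow \quad g_1(x) = 1 + x + x^2 \qquad (2.14)$$
$$J_i^{(2)} = \sigma_i \sigma_{i-2} \quad \Rightarrow \quad g_2(x) = 1 + x^2 \qquad (2.15)$$

(c). The code with range $r = 4$ used by Berrou and collaborators to build the first
example of turbo code:

$$J_i^{(1)} = \sigma_i \sigma_{i-1} \sigma_{i-2} \sigma_{i-3} \sigma_{i-4} \quad \Rightarrow \quad g_1(x) = 1 + x + x^2 + x^3 + x^4 \qquad (2.16)$$
$$J_i^{(2)} = \sigma_i \sigma_{i-4} \quad \Rightarrow \quad g_2(x) = 1 + x^4 \qquad (2.17)$$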
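
The encoding rule of Eq. (2.8) for a non recursive code can be sketched as follows. The tuples `KAPPA_1` and `KAPPA_2` hold the coefficients $\kappa(j;1)$ and $\kappa(j;2)$ of code (b); the function name and the sample message are illustrative choices, and Python indices start at 0, so an index below 0 stands for $\sigma_j = +1$ with $j \leq 0$.

```python
def conv_encode(sigma, kappa):
    """Eq. (2.8): J_i = prod_j sigma_{i-j}^{kappa[j]}, with sigma = +1 before the start."""
    out = []
    for i in range(len(sigma)):
        J = 1
        for j, k in enumerate(kappa):
            if k and i - j >= 0:      # spins before the start of the sequence are +1
                J *= sigma[i - j]
        out.append(J)
    return out

# Code (b): g1(x) = 1 + x + x^2, g2(x) = 1 + x^2.
KAPPA_1 = (1, 1, 1)
KAPPA_2 = (1, 0, 1)

tau = [+1, -1, -1, +1, -1]   # sample source message
sigma = tau                   # non recursive case, Eq. (2.7)
J1 = conv_encode(sigma, KAPPA_1)
J2 = conv_encode(sigma, KAPPA_2)
print(J1, J2)
```

The two output lists together form the rate 1/2 encoded message of Eq. (2.2).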



Recursive convolutional codes are most easily defined in terms of the generating
polynomials. The difference with non recursive codes is in the relation between the source and
the register sequences. In the non recursive case it was given by Eq. (2.7) or by Eq. (2.10).
In the recursive case one has:

$$G(x) = \frac{1}{g_1(x)}\, H(x) \qquad (2.18)$$

so that Eq. (2.11) gives

$$\mathcal{G}^{(1)}(x) = g_1(x)\, G(x) = H(x)\ , \qquad \mathcal{G}^{(2)}(x) = g_2(x)\, G(x) = \frac{g_2(x)}{g_1(x)}\, H(x) \qquad (2.19)$$

Two different recursive codes can be defined by permuting the two polynomials $g_1$ and $g_2$.

It is easy to show that Eq. (2.19) is equivalent to

$$\sigma_i(\boldsymbol{\tau}) = \Sigma_0(i) = \tau_i \prod_{j=1}^{r} \left(\Sigma_j(i)\right)^{\kappa(j;1)} = \tau_i \prod_{j=1}^{r} \left(\sigma_{i-j}(\boldsymbol{\tau})\right)^{\kappa(j;1)} \qquad (2.20)$$

From Eq. (2.20) it follows that:

$$\tau_i(\boldsymbol{\sigma}) = \prod_{j=0}^{r} \left(\sigma_{i-j}\right)^{\kappa(j;1)} = J_i^{(1)} \qquad (2.21)$$

Because of the last equality in Eq. (2.21) a part of the encoded message (in the recursive
case) is always the message itself.
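
The recursion of Eq. (2.20) and the identity $J_i^{(1)} = \tau_i$ of Eq. (2.21) can be checked on a short sequence. The sketch below uses code (b); the function names and the sample message are illustrative, and spins with index before the start of the sequence are taken equal to $+1$.

```python
def register_sequence(tau, kappa1):
    """Eq. (2.20): sigma_i = tau_i * prod_{j=1..r} sigma_{i-j}^{kappa1[j]}."""
    sigma = []
    for i, t in enumerate(tau):
        s = t
        for j in range(1, len(kappa1)):
            if kappa1[j] and i - j >= 0:
                s *= sigma[i - j]
        sigma.append(s)
    return sigma

def conv_encode(sigma, kappa):
    """Eq. (2.8): J_i = prod_j sigma_{i-j}^{kappa[j]}."""
    out = []
    for i in range(len(sigma)):
        J = 1
        for j, k in enumerate(kappa):
            if k and i - j >= 0:
                J *= sigma[i - j]
        out.append(J)
    return out

KAPPA_1 = (1, 1, 1)   # g1(x) = 1 + x + x^2
KAPPA_2 = (1, 0, 1)   # g2(x) = 1 + x^2

tau = [+1, -1, -1, +1, -1]
sigma = register_sequence(tau, KAPPA_1)
J1 = conv_encode(sigma, KAPPA_1)   # reproduces the message, Eq. (2.21)
J2 = conv_encode(sigma, KAPPA_2)   # the redundant part
print(sigma, J1, J2)
```

The first output stream coincides with the source message, which is the systematic property stated above.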
We shall now consider decoding. Using the method explained in the Introduction the
probability distribution of the register sequence conditional to some output can be written
as the Boltzmann weight of a spin model with random couplings. The Hamiltonian of
this model is:

$$H(\boldsymbol{\sigma}; \mathbf{J}^{out}) = -\sum_{i=1}^{N} \left\{ B(J_i^{(1),out}) \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;1)} + B(J_i^{(2),out}) \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;2)} \right\} \qquad (2.22)$$

where $B(\cdot)$ is defined in Eq. (1.7).

For convolutional codes the model is one dimensional with two types of couplings. The
range of the interaction coincides with the range of the code. The alert reader will notice
that the Hamiltonian is expressed as a function of the spins of the register sequence $\sigma_i$,
instead of the source sequence $\tau_i$ used in the introduction. For non recursive codes $\sigma_i = \tau_i$.
For recursive codes $\tau_i$ is given by Eq. (2.21), i.e. in this last case decoding can be thought
of as the computation of an expectation value of a composite operator. However the spin
Hamiltonian is the same for both the recursive and non recursive codes.
We define the decoding at arbitrary temperature $T = 1/\beta$ as follows:

$$\tau_i^{\beta} \equiv \mathrm{sign}\left(\langle \tau_i(\boldsymbol{\sigma}) \rangle_{\beta}\right) \qquad (2.23)$$
$$\langle O(\boldsymbol{\sigma}) \rangle_{\beta} \equiv \frac{1}{Z(\mathbf{J}^{out}; \beta)} \sum_{\boldsymbol{\sigma}} O(\boldsymbol{\sigma}) \exp\{-\beta H(\boldsymbol{\sigma}; \mathbf{J}^{out})\} \qquad (2.24)$$

where the expression for $\tau_i(\boldsymbol{\sigma})$ is given by Eq. (2.21) or by Eq. (2.7) depending on whether the
code is recursive or not.

As seen in the introduction there are two widely used decoding strategies:

• Maximum Likelihood decoding, which consists in finding the most probable sequence
of bits and corresponds to the choice $\beta = \infty$ in Eq. (2.23): $\tau_i^{ML} = \tau_i^{\beta=\infty}$.

• Maximum A posteriori Probability decoding, which consists in finding the most
probable value of each bit and corresponds to the choice $\beta = 1$ in Eq. (2.23): $\tau_i^{MAP} = \tau_i^{\beta=1}$.
This is the strategy which enters in turbo decoding.

Both these strategies can be implemented in a very efficient way using the transfer matrix
technique. The corresponding algorithms are known in communication theory as the
Viterbi algorithm for the $\beta = \infty$ case and the BCJR algorithm for the $\beta = 1$ case.
The complexity of these algorithms grows like $N 2^r$.
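
To make the transfer matrix technique concrete, the sketch below computes the magnetizations $\langle \sigma_i \rangle_\beta$ of Eq. (2.24) for the range 1 code (a), whose Hamiltonian is $-\sum_i B^{(1)}_i \sigma_i \sigma_{i-1} - \sum_i B^{(2)}_i \sigma_i$ with $\sigma_0 = +1$, and compares them with a direct enumeration. This is a simplified illustration of a forward and backward sweep, not the BCJR algorithm as implemented in practice: the messages are not normalised (so it is only suitable for short chains) and the coupling values are hypothetical.

```python
import itertools
import math

SPINS = (+1, -1)

def magnetizations_transfer(B1, B2, beta=1.0):
    """<sigma_i> for code (a) by a forward and a backward sweep along the chain."""
    n = len(B1)

    def w(i, s, s_prev):
        # Boltzmann factor of the terms of site i: B1_i s_i s_{i-1} + B2_i s_i
        return math.exp(beta * (B1[i] * s * s_prev + B2[i] * s))

    fwd = [{s: w(0, s, +1) for s in SPINS}]            # sigma_0 = +1 by convention
    for i in range(1, n):
        fwd.append({s: sum(fwd[-1][p] * w(i, s, p) for p in SPINS) for s in SPINS})

    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in SPINS}
    for i in range(n - 2, -1, -1):
        bwd[i] = {s: sum(w(i + 1, q, s) * bwd[i + 1][q] for q in SPINS) for s in SPINS}

    m = []
    for i in range(n):
        z = sum(fwd[i][s] * bwd[i][s] for s in SPINS)
        m.append(sum(s * fwd[i][s] * bwd[i][s] for s in SPINS) / z)
    return m

def magnetizations_brute(B1, B2, beta=1.0):
    """Same quantity by summing over all 2^N configurations (check only)."""
    n = len(B1)
    z, m = 0.0, [0.0] * n
    for sigma in itertools.product(SPINS, repeat=n):
        e = 0.0
        for i in range(n):
            prev = sigma[i - 1] if i > 0 else +1
            e -= B1[i] * sigma[i] * prev + B2[i] * sigma[i]
        weight = math.exp(-beta * e)
        z += weight
        for i in range(n):
            m[i] += sigma[i] * weight
    return [x / z for x in m]

B1 = [0.3, -1.2, 0.7, 0.1, -0.5]   # hypothetical couplings B(J^(1),out)
B2 = [1.0, -0.4, 0.2, -0.9, 0.6]   # hypothetical couplings B(J^(2),out)
m_fast = magnetizations_transfer(B1, B2)
m_slow = magnetizations_brute(B1, B2)
print(m_fast)
```

The sweep visits each site once and sums over the $2^r = 2$ states of the neighbouring register, in line with the $N 2^r$ scaling quoted above, whereas the enumeration costs $2^N$.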

The use of the register sequence (i.e. of the $\sigma$ variables) makes evident the similarity
between recursive and non recursive codes: they correspond to the same spin model. This
implies e.g. that, if zero temperature decoding is adopted, the probability of transmitting
a message without errors is the same with the two codes.

In the limit $r \to \infty$ it is possible to construct convolutional codes corresponding to spin
models with infinite connectivity and couplings between an infinite number of spins. They
should allow transmission without errors when the noise is low enough. In practice, because
of the growing complexity of the transfer matrix algorithm, a compromise between low
$r$'s (which are simpler to decode) and high $r$'s (which show better performances) must be
found. The values of $r$ used in practical cases are between 2 and 7.
We can write the decoding strategy in terms of the message (i.e. of the $\tau$ variables)
without making use of the register sequence (i.e. of the $\sigma$ variables):

$$\tau_i^{\beta} = \mathrm{sign}\left( \frac{1}{Z(\mathbf{J}^{out}; \beta)} \sum_{\boldsymbol{\tau}} \tau_i \exp\{-\beta H(\boldsymbol{\sigma}(\boldsymbol{\tau}); \mathbf{J}^{out})\} \right) \qquad (2.25)$$

For non recursive codes, because of Eq. (2.7), things remain unchanged. However, for
recursive codes, since Eq. (2.21) cannot be inverted in a local way, we obtain a non local
Hamiltonian.

As a simple illustration of this observation we can consider the Hamiltonian corresponding
to the code (a):

$$H^{(a)}(\boldsymbol{\sigma}; \mathbf{J}) = -\sum_{i=1}^{N} B(J_i^{(1),out})\, \sigma_i \sigma_{i-1} - \sum_{i=1}^{N} B(J_i^{(2),out})\, \sigma_i \qquad (2.26)$$

$$H^{(a)}(\boldsymbol{\sigma}(\boldsymbol{\tau}); \mathbf{J}) = -\sum_{i=1}^{N} B(J_i^{(1),out})\, \tau_i - \sum_{i=1}^{N} B(J_i^{(2),out}) \prod_{j=1}^{i} \tau_j \qquad (2.27)$$

For less simple codes we define the numbers $\rho(j) \in \{0, 1\}$ as follows:

$$\frac{g_2(x)}{g_1(x)} \equiv \sum_{j=0}^{\infty} \rho(j)\, x^j \pmod 2 \qquad (2.28)$$

we get

$$H(\boldsymbol{\sigma}(\boldsymbol{\tau}); \mathbf{J}) = -\sum_{i=1}^{N} B(J_i^{(1),out})\, \tau_i - \sum_{i=1}^{N} B(J_i^{(2),out}) \prod_{j=0}^{i-1} (\tau_{i-j})^{\rho(j)} \qquad (2.29)$$

Written in this form recursive codes look very different from non recursive ones with
the same range. If $g_2(x)$ is not divisible by $g_1(x)$ the corresponding spin models have
infinite connectivity and interactions with infinite range; they are similar, in this respect,
to $r = \infty$ non recursive codes.

Nevertheless they do not behave, in general, radically better than the non recursive codes
with the same range because there exists, as we have shown, a change of variables (from
$\tau$ to $\sigma$) which makes the model local.
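
The coefficients $\rho(j)$ of Eq. (2.28) follow from a power series division over $Z_2$. A minimal sketch, assuming $g_1(0) = 1$ (which holds here because $\kappa(0;1) = 1$); the function name is an illustrative choice and polynomials are given as tuples of coefficients, lowest order first:

```python
def rho(g1, g2, n_terms):
    """First n_terms coefficients of g2(x)/g1(x) mod 2; requires g1[0] == 1."""
    out = []
    for j in range(n_terms):
        c = g2[j] if j < len(g2) else 0
        # rho(j) = g2_j + sum_{k>=1} g1_k * rho(j-k)  (mod 2)
        for k in range(1, min(j, len(g1) - 1) + 1):
            c ^= g1[k] & out[j - k]
        out.append(c)
    return out

rho_a = rho((1, 1), (1,), 8)          # code (a): 1 / (1 + x)
rho_b = rho((1, 1, 1), (1, 0, 1), 8)  # code (b): (1 + x^2) / (1 + x + x^2)
print(rho_a, rho_b)
```

For code (a) every coefficient equals 1, which reproduces the product $\prod_{j=1}^{i} \tau_j$ of Eq. (2.27). For code (b) the series does not terminate either, so the Hamiltonian written in terms of $\tau$ couples spins at arbitrarily large distance.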



3 Turbo codes. 

A turbo code is defined by the choice of a convolutional code and of a permutation of
$N$ objects. We use for the permutation the following notation:

$$P : \{1, \dots, N\} \to \{1, \dots, N\} \qquad (3.1)$$
$$i \mapsto P(i) \qquad (3.2)$$

and we shall denote by $P^{-1}$ the inverse permutation ($P(P^{-1}(i)) = P^{-1}(P(i)) = i$).
The basic idea is to apply the permutation $P$ to the source sequence $\boldsymbol{\tau}$ to produce a new
sequence $\boldsymbol{\tau}^P$. Obviously $\boldsymbol{\tau}^P$ does not carry any new information because $P$ is known.
Both sequences $\boldsymbol{\tau}$ and $\boldsymbol{\tau}^P$ are the inputs to two sets of registers, each one implementing
a convolutional encoding. In this way the rate of the code is decreased (i.e. the redundancy
is greater). One can increase the rate by erasing some of the outputs but we will not
consider this possibility in this paper.

The properties of the system can strongly depend on the choice of the permutation.
Permutations "near" the identity give very bad codes. We shall think of a "good" permutation
as a random permutation. In the limit $N \to \infty$ random permutations are "far" from the identity
with probability one. We shall discuss this point later in this section.
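We illustrate this idea with the example of a rate 1/2 recursive convolutional code, defined
by the constants $\kappa(j;1)$ and $\kappa(j;2)$. The two register sequences are defined by:

$$\tau_i = \epsilon_i(\boldsymbol{\sigma}^{(1)}) \equiv \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;1)} \qquad (3.3)$$

$$\tau_i^P = \epsilon_i(\boldsymbol{\sigma}^{(2)}) \equiv \prod_{j=0}^{r} (\sigma^{(2)}_{i-j})^{\kappa(j;1)} \qquad (3.4)$$

where $\boldsymbol{\tau}^P$ is the permuted message ($\tau_i^P = \tau_{P(i)}$).

The relation between the two register sequences is rather involved and nonlocal for a
general choice of the permutation. Moreover $\sigma_i^{(1)}$ can be expressed only in terms of a
large number of $\sigma_j^{(2)}$'s. The identity permutation is clearly an exception since in this case
$\boldsymbol{\sigma}^{(1)} = \boldsymbol{\sigma}^{(2)}$.

Let us consider as an example the code (a):

$$\sigma_i^{(1)} = \prod_{j=1}^{i} \sigma^{(2)}_{P^{-1}(j)}\, \sigma^{(2)}_{P^{-1}(j)-1} \qquad (3.5)$$

It is simple to show that, for a random permutation, the number of different $\sigma^{(2)}$'s in the
product on the r.h.s. of Eq. (3.5) is of order $O(N)$.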

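The turbo code defined by the permutation $P$ and by the numbers $\kappa(j;1)$ and $\kappa(j;2)$ has
rate 1/3 and the encoded message has the following form:

$$J_i^{(0)} \equiv \tau_i = \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;1)} \qquad (3.6)$$
$$J_i^{(1)} \equiv \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;2)} \qquad (3.7)$$
$$J_i^{(2)} \equiv \prod_{j=0}^{r} (\sigma^{(2)}_{i-j})^{\kappa(j;2)} \qquad (3.8)$$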

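It turns out that it is convenient to write the corresponding Hamiltonian as a function
of both register sequences. This introduces new degrees of freedom and the Hamiltonian
is a function of $2N$ instead of $N$ spins. The unwanted degrees of freedom are eliminated
by imposing the constraint $\tau_i^P = \tau_{P(i)}$. This constraint can be written in terms of the $\sigma$'s
using Eqs. (3.3) and (3.4). The probability distribution for the register sequences can then
be written as:

$$P(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)} | \mathbf{J}^{out}) = \frac{1}{Z} \prod_{i=1}^{N} \delta\left(\epsilon_{P(i)}(\boldsymbol{\sigma}^{(1)}), \epsilon_i(\boldsymbol{\sigma}^{(2)})\right) \exp\{-H(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)}; \mathbf{J}^{out})\} \qquad (3.9)$$

$$H(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)}; \mathbf{J}^{out}) = -\sum_{i=1}^{N} \left\{ B(J_i^{(0),out}) \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;1)} + B(J_i^{(1),out}) \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;2)} + B(J_i^{(2),out}) \prod_{j=0}^{r} (\sigma^{(2)}_{i-j})^{\kappa(j;2)} \right\} \qquad (3.10)$$

where $\delta(i,j)$ is the ordinary Kronecker function. In this way the probability distribution
is a local function of the spin variables $\boldsymbol{\sigma}^{(1)}$ and $\boldsymbol{\sigma}^{(2)}$.

We shall call the code defined by Eqs. (3.6), (3.7), (3.8) a non recursive turbo code if
$\kappa(j;1) = \delta_{j,0}$ and a recursive turbo code otherwise. Recursive turbo codes are the ones usually
called turbo codes in communication theory.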
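
The construction of Eqs. (3.3), (3.4) and (3.6)-(3.8) can be sketched as a small encoder. The helper names, the sample message and the particular permutation below are illustrative assumptions; the code uses the recursive code (a), with Python indices starting at 0 and `perm[i]` standing for $P(i)$.

```python
def register_sequence(tau, kappa1):
    """Eq. (2.20): sigma_i = tau_i * prod_{j=1..r} sigma_{i-j}^{kappa1[j]}."""
    sigma = []
    for i, t in enumerate(tau):
        s = t
        for j in range(1, len(kappa1)):
            if kappa1[j] and i - j >= 0:
                s *= sigma[i - j]
        sigma.append(s)
    return sigma

def conv_encode(sigma, kappa):
    """Eq. (2.8): J_i = prod_j sigma_{i-j}^{kappa[j]}."""
    out = []
    for i in range(len(sigma)):
        J = 1
        for j, k in enumerate(kappa):
            if k and i - j >= 0:
                J *= sigma[i - j]
        out.append(J)
    return out

def turbo_encode(tau, perm, kappa1, kappa2):
    """Return the three output streams (J0, J1, J2) of a rate 1/3 turbo code."""
    tau_p = [tau[p] for p in perm]              # tau^P_i = tau_{P(i)}
    sigma1 = register_sequence(tau, kappa1)     # Eq. (3.3)
    sigma2 = register_sequence(tau_p, kappa1)   # Eq. (3.4)
    J0 = conv_encode(sigma1, kappa1)            # Eq. (3.6), equals tau
    J1 = conv_encode(sigma1, kappa2)            # Eq. (3.7)
    J2 = conv_encode(sigma2, kappa2)            # Eq. (3.8)
    return J0, J1, J2

KAPPA_1, KAPPA_2 = (1, 1), (1,)   # code (a): g1 = 1 + x, g2 = 1
tau = [+1, -1, -1, +1, -1, +1]
perm = [3, 0, 5, 1, 4, 2]         # an arbitrary fixed permutation
J0, J1, J2 = turbo_encode(tau, perm, KAPPA_1, KAPPA_2)
print(J0, J1, J2)
```

With the identity permutation the second and third streams coincide, so the permuted chain adds no independent check; choosing `kappa1 = (1,)` instead gives the non recursive variant, in which both register sequences reduce to the (permuted) message.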

The probability distribution for the recursive turbo code (3.9) cannot be written in terms
of one of the two register sequences $\boldsymbol{\sigma}^{(1)}$ or $\boldsymbol{\sigma}^{(2)}$ without producing large connectivities
(see Eq. (3.5)).

If $P$ is the identity permutation then $\boldsymbol{\sigma}^{(1)} = \boldsymbol{\sigma}^{(2)}$ and the code becomes a convolutional
one with the same rate (1/3) and the same generating polynomials. We shall use the
convolutional code obtained in this way as a standard comparison term for the
performances of turbo codes (see Figs. 4 and 5). The outcome of this comparison (i.e. recursive
turbo codes have a much lower error probability than convolutional codes) demonstrates
the importance of the choice of the permutation.

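For non recursive turbo codes the two register sequences are related simply by a
permutation:

$$\sigma_i^{(1)} = \sigma_i(\boldsymbol{\tau}) = \sigma_{P^{-1}(i)}(\boldsymbol{\tau}^P) = \sigma^{(2)}_{P^{-1}(i)} \qquad (3.11)$$

and

$$P_{non\text{-}rec}(\boldsymbol{\sigma}^{(1)} | \mathbf{J}^{out}) = \frac{1}{Z} \exp\{-H(\boldsymbol{\sigma}^{(1)}, (\boldsymbol{\sigma}^{(1)})^P; \mathbf{J}^{out})\} \qquad (3.12)$$

$$H(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)}; \mathbf{J}^{out}) \equiv -\sum_{i=1}^{N} \left\{ B(J_i^{(0),out})\, \sigma_i^{(1)} + B(J_i^{(1),out}) \prod_{j=0}^{r} (\sigma^{(1)}_{i-j})^{\kappa(j;2)} + B(J_i^{(2),out}) \prod_{j=0}^{r} (\sigma^{(2)}_{i-j})^{\kappa(j;2)} \right\} \qquad (3.13)$$

so that the spin model corresponding to this type of code has a finite connectivity.

This finite versus infinite connectivity is the essential difference between non recursive
and recursive turbo codes and explains why recursive turbo codes are so much better and why
they can achieve zero error probability for low enough noise.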

We now discuss decoding. There is no exact decoding algorithm for turbo codes. Berrou et
al. have proposed a very ingenious algorithm, called turbo decoding, which is thought to
be approximate. Turbo decoding is an iterative procedure. At each step of the iteration,
one considers one of the two chains, i.e. either the couplings $\mathbf{J}^{(0)}$ and $\mathbf{J}^{(1)}$ or $\mathbf{J}^{(0)}$ and
$\mathbf{J}^{(2)}$, and proceeds to MAP decoding. The information so obtained is injected into the next
step by adding appropriate external fields to the Hamiltonian. The algorithm terminates
if a fixed point is reached.

In order to explain the algorithm more precisely, we introduce the following expectation
values:

$$\Theta_i[\mathbf{B}, \mathbf{B}'] \equiv \frac{1}{Z} \sum_{\boldsymbol{\sigma}} \epsilon_i(\boldsymbol{\sigma}) \exp\left[ \sum_{i=1}^{N} B_i \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;1)} + \sum_{i=1}^{N} B'_i \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;2)} \right] \qquad (3.14)$$

The $\Theta_i$'s can be computed in an efficient way by using the finite temperature transfer
matrix algorithm. They are the expectation values of the operator defined by Eqs. (3.3)
or (3.4).

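Then we introduce the iteration variables:

$$\boldsymbol{\Gamma}^{(m)}(t) \equiv (\Gamma_1^{(m)}(t), \dots, \Gamma_N^{(m)}(t)) \qquad (3.15)$$
$$\boldsymbol{\Theta}^{(m)}(t) \equiv (\Theta_1^{(m)}(t), \dots, \Theta_N^{(m)}(t)) \qquad (3.16)$$

for $m = 1, 2$.

In terms of these variables the iteration reads

$$\Theta_i^{(1)}(t+1) = \Theta_i[\mathbf{B}^{(0)} + \boldsymbol{\Gamma}^{(1)}(t), \mathbf{B}^{(1)}] \qquad (3.17)$$
$$\Theta_i^{(2)}(t+1) = \Theta_i[\mathbf{B}^{(0)P} + \boldsymbol{\Gamma}^{(2)}(t), \mathbf{B}^{(2)}] \qquad (3.18)$$
$$\Gamma_i^{(1)}(t+1) = \mathrm{arctanh}\, \Theta^{(2)}_{P^{-1}(i)}(t+1) - \Gamma^{(2)}_{P^{-1}(i)}(t) - B_i^{(0)} \qquad (3.19)$$
$$\Gamma_i^{(2)}(t+1) = \mathrm{arctanh}\, \Theta^{(1)}_{P(i)}(t+1) - \Gamma^{(1)}_{P(i)}(t) - B^{(0)}_{P(i)} \qquad (3.20)$$

with

$$B_i^{(m)} \equiv B(J_i^{(m),out})\ , \qquad m = 0, 1, 2 \qquad (3.21)$$

and $B_i^{(0)P} \equiv B^{(0)}_{P(i)}$.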

The meaning of the previous equations is the following. The $\Theta_i$ are expectation values of
a sequence of operators which can take only values $\pm 1$, computed independently for every
element of the sequence. The information contained in $\Theta_i$ can therefore be represented
by an "external field" $\Gamma_i$ such that $\Theta_i = \tanh \Gamma_i$. In order to avoid double counting
of information one subtracts the external fields of the previous iteration as shown in
Eqs. (3.19), (3.20).

Hopefully the iteration converges to a fixed point:

$$\lim_{t \to \infty} \Theta_i^{(1)}(t) = \lim_{t \to \infty} \Theta^{(2)}_{P^{-1}(i)}(t) \equiv \Theta_i^{*} \qquad (3.22)$$

The decoded message is obtained as follows:

$$\tau_i^{turbo} \equiv \mathrm{sign}(\Theta_i^{*}) \qquad (3.23)$$

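The system described by Eqs. (3.9), (3.10) is seen in turbo decoding as the union of two one
dimensional subsystems. Each subsystem acts on the other one through a magnetic field
(in the non recursive case) or through an additional coupling (in the recursive case).
To get some insight into Eqs. (3.17)-(3.20) we define the free energy functionals $F^{(1)}$ and $F^{(2)}$:

$$Z^{(1)}[\boldsymbol{\Gamma}] \equiv \sum_{\boldsymbol{\sigma}} \exp\left\{ \sum_{i=1}^{N} \left(B(J_i^{(0)}) + \Gamma_i\right) \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;1)} + \sum_{i=1}^{N} B(J_i^{(1)}) \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;2)} \right\} \qquad (3.24)$$

$$F^{(1)}[\boldsymbol{\Theta}] \equiv \sum_{i=1}^{N} \Theta_i \Gamma_i - \log(Z^{(1)}[\boldsymbol{\Gamma}]) \qquad (3.25)$$

$$Z^{(2)}[\boldsymbol{\Gamma}] \equiv \sum_{\boldsymbol{\sigma}} \exp\left\{ \sum_{i=1}^{N} \Gamma_i \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;1)} + \sum_{i=1}^{N} B(J_i^{(2)}) \prod_{j=0}^{r} (\sigma_{i-j})^{\kappa(j;2)} \right\} \qquad (3.26)$$

$$F^{(2)}[\boldsymbol{\Theta}] \equiv \sum_{i=1}^{N} \Theta_i \Gamma_i - \log(Z^{(2)}[\boldsymbol{\Gamma}]) \qquad (3.27)$$

where in each case $\boldsymbol{\Gamma}$ is fixed in terms of $\boldsymbol{\Theta}$ by $\Theta_i = \partial \log Z^{(m)}[\boldsymbol{\Gamma}] / \partial \Gamma_i$.

It is then simple to show that $\boldsymbol{\Theta}^{*}$ is a solution of the equation:

$$\left. \frac{\partial F_{turbo}[\boldsymbol{\Theta}]}{\partial \Theta_i} \right|_{\boldsymbol{\Theta} = \boldsymbol{\Theta}^{*}} = 0 \qquad (3.28)$$

where

$$F_{turbo}[\boldsymbol{\Theta}] \equiv F^{(1)}[\boldsymbol{\Theta}] + F^{(2)}[\boldsymbol{\Theta}^{P}] - F_0[\boldsymbol{\Theta}] \qquad (3.29)$$

$$F_0[\boldsymbol{\Theta}] \equiv \sum_{i=1}^{N} f(\Theta_i)\ , \qquad f(x) \equiv \frac{1+x}{2} \log \frac{1+x}{2} + \frac{1-x}{2} \log \frac{1-x}{2} \qquad (3.30)$$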

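The functional $F_{turbo}$ entering Eq. (3.28) is an approximation to the true free energy
functional of the total system, which is given by:

$$Z[\boldsymbol{\Gamma}; \mathbf{J}^{(0)}, \mathbf{J}^{(1)}, \mathbf{J}^{(2)}] \equiv \sum_{\boldsymbol{\sigma}^{(1)}} \sum_{\boldsymbol{\sigma}^{(2)}} \prod_{i=1}^{N} \delta\left(\epsilon_{P(i)}(\boldsymbol{\sigma}^{(1)}), \epsilon_i(\boldsymbol{\sigma}^{(2)})\right) \exp\left\{ -H(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)}; \mathbf{J}^{out}) + \sum_{i=1}^{N} \Gamma_i\, \epsilon_i(\boldsymbol{\sigma}^{(1)}) \right\} \qquad (3.31)$$

$$\mathcal{F}[\boldsymbol{\Theta}; \mathbf{J}^{(0)}, \mathbf{J}^{(1)}, \mathbf{J}^{(2)}] \equiv \sum_{i=1}^{N} \Theta_i \Gamma_i - \log(Z[\boldsymbol{\Gamma}])\ , \qquad \Theta_i = \frac{\partial \log(Z)}{\partial \Gamma_i} \qquad (3.32)$$

where $H(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)}; \mathbf{J}^{out})$ is given in Eq. (3.10).
It is then evident that

$$F_{turbo}[\boldsymbol{\Theta}] = \mathcal{F}[\boldsymbol{\Theta}; \mathbf{J}^{(0)}, \mathbf{J}^{(1)}, 0] + \mathcal{F}[\boldsymbol{\Theta}; 0, 0, \mathbf{J}^{(2)}] - \mathcal{F}[\boldsymbol{\Theta}; 0, 0, 0] \qquad (3.33)$$

i.e. turbo decoding neglects terms of order $B(J^{(1)}) B(J^{(2)})$.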



4 Replica approach. 

We would like to compute the error probability per bit. As explained in the introduction,
in the case of a symmetric transmission channel, it is enough to compute the magnetization
in the case of all inputs $\tau_i = 1$. The error probability per bit is given by the probability
of a local magnetization being negative.

The similarity of the Hamiltonian (3.10) with the Hamiltonians of disordered spin systems
is obvious. The disorder in the case of turbo codes has two origins. One is due to the
(random) permutation which defines the particular code. The other is more conventional
and is related to the randomness of the couplings which is due to the transmission noise.
As usual in disordered systems, we can only compute the average over disorder and for
that we have to introduce replicas.

Let us define the expectation value of the operator $\epsilon_i(\boldsymbol{\sigma})$ defined in Eqs. (3.3), (3.4) with
respect to the probability distribution given by Eqs. (3.9), (3.10):



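$$\Theta_i[\mathbf{J}^{out}, P] \equiv \sum_{\boldsymbol{\sigma}^{(1)}} \sum_{\boldsymbol{\sigma}^{(2)}} \epsilon_i(\boldsymbol{\sigma}^{(1)})\, P(\boldsymbol{\sigma}^{(1)}, \boldsymbol{\sigma}^{(2)} | \mathbf{J}^{out}) \qquad (4.1)$$

The statistical properties of a turbo code can be derived from the probability distribution
of this expectation value:

$$\mathcal{P}_i(\theta | P) \equiv \int dQ[\mathbf{J}^{out}]\ \delta\left(\theta - \Theta_i[\mathbf{J}^{out}, P]\right)\ , \qquad i = 1, \dots, N \qquad (4.2)$$

where

$$dQ[\mathbf{J}^{out}] \equiv \prod_{n=0}^{2} \prod_{i=1}^{N} Q(J_i^{(n),out} | +1)\, dJ_i^{(n),out} \qquad (4.3)$$

Then we define the average distribution

$$\mathcal{P}(\theta) \equiv \frac{1}{N!} \sum_{P} \int dQ[\mathbf{J}^{out}]\ \delta\left(\theta - \Theta_i[\mathbf{J}^{out}, P]\right) \qquad (4.4)$$

where the sum runs over all possible permutations. $\mathcal{P}(\theta)$ is expected not to depend upon
the site $i$ in the thermodynamic limit ($N \to \infty$).
The average error probability per bit is given by

$$\overline{P}_e = \int_{-1}^{0} d\theta\ \mathcal{P}(\theta) \qquad (4.5)$$

In any case $\overline{P}_e$ is an upper bound for the error probability of the "best" code (i.e. the one
built with the permutation which yields the lowest error probability).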
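The replicated partition function is given by:

$$\overline{Z^n} \equiv \frac{1}{N!} \sum_{P} \int dQ[\mathbf{J}^{out}] \sum_{\{\boldsymbol{\sigma}^{(1),a}\}} \sum_{\{\boldsymbol{\sigma}^{(2),a}\}} \prod_{a=1}^{n} \prod_{i=1}^{N} \delta\left(\epsilon_{P(i)}(\boldsymbol{\sigma}^{(1),a}), \epsilon_i(\boldsymbol{\sigma}^{(2),a})\right) \exp\left[ -\sum_{a=1}^{n} H(\boldsymbol{\sigma}^{(1),a}, \boldsymbol{\sigma}^{(2),a}; \mathbf{J}^{out}) \right] \qquad (4.6)$$

The average over permutations can be done by introducing a matrix representation of the
permutation

$$C_{ij} \equiv \delta_{i, P(j)}\ ; \qquad i, j = 1, \dots, N \qquad (4.7)$$

To sum over permutations, one sums over all matrices $C_{ij} = 0$ or $1$ with the constraint
$\sum_i C_{ij} = \sum_j C_{ij} = 1$. We use the identity $\delta(\sigma, \tau) = (1 + \sigma\tau)/2$ to write

$$\prod_{a=1}^{n} \prod_{i=1}^{N} \delta\left(\epsilon_{P(i)}(\boldsymbol{\sigma}^{(1),a}), \epsilon_i(\boldsymbol{\sigma}^{(2),a})\right) = \prod_{a=1}^{n} \prod_{i,j=1}^{N} \left[ 1 - \frac{1}{2} C_{ij} + \frac{1}{2} C_{ij}\, \epsilon_i(\boldsymbol{\sigma}^{(1),a})\, \epsilon_j(\boldsymbol{\sigma}^{(2),a}) \right] \qquad (4.8)$$

It can be shown that the "effective action" which is obtained in this way describes two
one-dimensional models coupled by a mean field-like interaction.
This is easily seen by making use of the occupation densities defined below:

$$c_k(\boldsymbol{\epsilon}) \equiv \frac{1}{N} \sum_{i=1}^{N} \delta_{\boldsymbol{\epsilon}, \boldsymbol{\epsilon}_i^{(k)}}\ , \qquad k = 1, 2 \qquad (4.9)$$
$$\boldsymbol{\epsilon} \equiv (\epsilon^1, \dots, \epsilon^a, \dots, \epsilon^n) \in \{-1, +1\}^n$$
$$\boldsymbol{\epsilon}_i^{(k)} \equiv \left(\epsilon_i(\boldsymbol{\sigma}^{(k),1}), \dots, \epsilon_i(\boldsymbol{\sigma}^{(k),a}), \dots, \epsilon_i(\boldsymbol{\sigma}^{(k),n})\right)$$

The resulting replicated partition function reads:

$$\overline{Z^n} = \int dQ[\mathbf{J}^{out}] \sum_{\{\boldsymbol{\sigma}^{(1),a}\}} \sum_{\{\boldsymbol{\sigma}^{(2),a}\}} \prod_{\boldsymbol{\epsilon}} \delta\left(c_1(\boldsymbol{\epsilon}), c_2(\boldsymbol{\epsilon})\right) \exp\left[ -\sum_{a=1}^{n} H(\boldsymbol{\sigma}^{(1),a}, \boldsymbol{\sigma}^{(2),a}; \mathbf{J}^{out}) + N \sum_{\boldsymbol{\epsilon}} c_1(\boldsymbol{\epsilon}) \log c_1(\boldsymbol{\epsilon}) \right] \qquad (4.10)$$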



We briefly report here the main results of this approach for the gaussian channel described
by Eq. (1.3). A detailed analysis will be presented elsewhere.

For recursive turbo codes there exists a low noise phase $w^2 < w_c^2$ where the error
probability vanishes in the thermodynamic limit (i.e. for infinitely long sequences). In this
phase the model is completely ordered:

$$\mathcal{P}(\theta) = \delta(\theta - 1) \qquad (4.11)$$

A local stability analysis yields the critical value $w_{loc}^2$ such that for $w^2 > w_{loc}^2$ the no-error
phase is destroyed by small fluctuations. Clearly $w_{loc}^2 \geq w_c^2$. We computed $w_{loc}^2$ for the
two cases listed below.

For both the rate is $R = 1/3$ so that the Shannon noise threshold as given by Eq. (1.4) is
$w^2_{Shannon} = 1/(2^{2/3} - 1)$, i.e. a signal to noise ratio $1/w^2_{Shannon} \simeq -2.31065$ dB (here and
below the values in dB are those of the signal to noise ratio $1/w^2$ at unit signal power).
Error free communication can take place only for $w^2 < w^2_{Shannon}$.

• For model (a), defined by Eqs. (2.12), (2.13), one gets $w_{loc}^2 = 1/\ln 4$, i.e. $1/w_{loc}^2 \simeq 1.41855$ dB.

• For model (b), defined by Eqs. (2.14), (2.15), one obtains $w_{loc}^2 = -1/(2 \ln x_c)$ where
$x_c \simeq 0.741912\dots$ is the only real solution of the equation

$$2x^5 + x^2 = 1 \qquad (4.12)$$

The resulting value $1/w_{loc}^2 \simeq -2.23990$ dB is quite near to the Shannon threshold.
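
The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below solves Eq. (4.12) by bisection and converts the three noise thresholds to signal to noise ratios in dB, assuming unit signal power; the helper names are illustrative.

```python
import math

def bisect_root(f, lo, hi, tol=1e-14):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def snr_db(w2):
    """Signal to noise ratio 1/w^2 in dB (unit signal power assumed)."""
    return -10.0 * math.log10(w2)

x_c = bisect_root(lambda x: 2 * x**5 + x**2 - 1, 0.0, 1.0)   # Eq. (4.12)

w2_shannon = 1.0 / (2.0 ** (2.0 / 3.0) - 1.0)   # rate 1/3, from Eq. (1.4)
w2_loc_a = 1.0 / math.log(4.0)                  # model (a)
w2_loc_b = -1.0 / (2.0 * math.log(x_c))         # model (b)

for name, w2 in [("Shannon", w2_shannon), ("loc (a)", w2_loc_a), ("loc (b)", w2_loc_b)]:
    print(f"{name}: w^2 = {w2:.5f}, 1/w^2 = {snr_db(w2):.5f} dB")
```

Both local stability thresholds lie on the low noise side of the Shannon threshold, as they must, and the one of model (b) falls within about 0.1 dB of it.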



5 Discussion. 

We formulated turbo codes as a spin model Hamiltonian and we obtained new results using
the replica method. It is well known that this method is not mathematically rigorous. So
it is natural to question the validity of our results. For this purpose we have carried out
numerical simulations of the following codes: the recursive turbo code corresponding to
the convolutional code (a) of Sec. 2, whose error probability is reported in Fig. 4; the non
recursive turbo code obtained by permuting the generating polynomials of the previous
one (see Fig. 5); the recursive turbo code corresponding to the code (b) of the same
section (see Fig. 6). We used the Berrou et al. turbo decoding algorithm and averaged
over 200 to 500 realizations of the disorder.

The first conclusion is that recursive turbo codes are much better codes than non recursive
ones. Furthermore our results for recursive turbo codes are compatible with the existence
of a threshold $w_c^2$ such that for $w^2 < w_c^2$ the error probability per bit is zero, while no such
threshold seems to exist for non recursive codes. This is in agreement with replica theory.
Zero error probability can only be achieved in the $N \to \infty$ limit. Our simulations are for
N = 10^. It would be interesting to perform a detailed study of finite size corrections, i.e.
of the $N$ dependence of the error probability per bit.

We now discuss the numerical value of the noise threshold $w_c^2$. The first remark is that
both numerically and analytically, the critical value is below Shannon's bound and that
it depends on the convolutional code (i.e. on the generating polynomials). The second
remark is that the analytical value of the threshold, $1/w_{loc}^2 \simeq 1.4186$ dB, is in very good agreement
with the numerical value for the code (a). For code (b) $1/w_{loc}^2 \simeq -2.240$ dB while one
gets $\simeq -1.7$ dB from the simulations. It would be interesting to understand this
disagreement. As we said in the previous section, $w_{loc}^2$ was calculated by a local stability
analysis of the ordered phase, i.e. we assumed that the transition is of second order. A
possible explanation would be that the transition is second order for code (a) and first
order for code (b). Numerical results seem to support this hypothesis, as the variation of
the error probability as a function of noise is much sharper in case (b). But a much more
careful analysis of finite size effects is necessary in order to settle this question numerically.
One should also look analytically for the occurrence of a first order transition.

Another important issue is the breaking of replica symmetry. Since turbo decoding is
thought to be an approximate algorithm, it may not be the best tool to look for replica
symmetry breaking. We have started an analytical investigation of replica symmetry
breaking.
Figure 1: Schematic representation of the encoder for a non recursive convolutional code,
called code (c) in the text and defined by Eqs. (2.16), (2.17).

Figure 2: The encoder for the recursive convolutional code with the same generating
polynomials as in Fig. 1 (cf. Eqs. (2.16), (2.17)).

Figure 3: Schematic representation of the encoder for a recursive turbo code with
generating polynomials as in the previous figures (cf. Eqs. (2.16), (2.17)). Notice the presence
of the interleaver (denoted by P) which implements the permutation.

Figure 4: Numerical results for the error probability per bit of the recursive turbo code
built from the convolutional code (a) (cf. Eqs. (2.12), (2.13)). Stars (*) refer to the
turbo code, diamonds (◇) to the convolutional code obtained by setting the permutation
P equal to the identity permutation, and the continuous line to the uncoded message.
The leftmost vertical line is located at the Shannon threshold ($w^2 = w^2_{Shannon}$) while the
rightmost is at the threshold of local stability ($w^2 = w_{loc}^2$, see Sec. 4).

Figure 5: Numerical results for the error probability per bit of the non recursive turbo
code built from the convolutional code (a) (cf. Eqs. (2.12), (2.13)). The symbols have
the same meaning explained in the caption of Fig. 4.